
Design & Analysis of Algorithms

Unit 1 - Understanding Fundamentals

🎯 Chapter 1 β€” Algorithm and Program Performance
πŸ“Š ALGORITHM PERFORMANCE - MIND MAP
Algorithm β†’ Step-by-step procedure to solve a problem
Time Complexity β†’ How running time grows with input size
Space Complexity β†’ How memory usage grows with input size
Analysis Types β†’ Best Case, Average Case, Worst Case
Asymptotic Notations β†’ Big O, Omega, Theta (growth comparison)
Recurrence β†’ Substitution, Recursion Tree, Master Method
1. What is an Algorithm?
Definition:
An algorithm is a step-by-step procedure or set of rules designed to solve a specific problem or perform a specific task. Think of it as a recipe for solving a problem.
Real-life Example - Making Tea:
Step 1: Boil water
Step 2: Add tea leaves
Step 3: Add sugar
Step 4: Add milk
Step 5: Strain and serve

This is an algorithm for making tea! Similarly, algorithms in programming solve computational problems.
Characteristics of a Good Algorithm:
1. Input: Algorithm should have zero or more inputs (data we provide)
2. Output: Must produce at least one output (result)
3. Definiteness: Each step must be clear and unambiguous (no confusion)
4. Finiteness: Must terminate after finite number of steps (should not run forever)
5. Effectiveness: Each step must be simple enough to be executed
6. Correctness: Should produce correct output for all valid inputs
Example Algorithm - Find Maximum of Two Numbers:
Algorithm: FindMaximum
Input: Two numbers a and b
Output: Maximum number
Step 1: Start
Step 2: Read values of a and b
Step 3: If a > b then max = a, else max = b
Step 4: Print max
Step 5: Stop
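The steps above translate directly into C (a minimal sketch; the function name `find_maximum` is chosen here for illustration):

```c
/* Step 3 of the algorithm above: compare a and b, keep the larger. */
int find_maximum(int a, int b) {
    return (a > b) ? a : b;
}
```

Calling `find_maximum(3, 7)` performs Steps 2-4 in one expression and returns 7.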
Algorithm vs Program:
β€’ Algorithm: Language-independent solution (written in plain English or pseudocode)
β€’ Program: Implementation of algorithm in a specific programming language (Java, C++, Python)
2. Why Study Algorithms?
βœ“ Efficiency: Helps write faster programs (less time)
βœ“ Optimization: Reduces memory usage (less space)
βœ“ Problem Solving: Improves logical thinking and coding skills
βœ“ Scalability: Ensures program works well with large data
βœ“ Career: Important for interviews and competitive programming
Real-life Impact:
β€’ Google Search uses algorithms to find results in milliseconds from billions of web pages
β€’ GPS uses algorithms to find shortest route
β€’ Netflix uses algorithms to recommend shows
β€’ Banks use algorithms for fraud detection
3. Time Complexity
What is Time Complexity?
Time complexity is a measure of the amount of time an algorithm takes to run as a function of the input size. It tells us how the running time grows when the input size increases.

Important: We don't measure actual time in seconds (that depends on computer speed). Instead, we count the number of operations performed.

Simple Example - Printing Numbers:

Algorithm 1: Print numbers 1 to n
for (i = 1 to n)
  print i

Operations: n times (one print for each number)
Time Complexity: O(n) - grows linearly with input

Algorithm 2: Print all pairs
for (i = 1 to n)
  for (j = 1 to n)
    print (i, j)

Operations: n Γ— n = nΒ² times
Time Complexity: O(nΒ²) - grows quadratically
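We can make "counting operations" concrete by instrumenting the two loops above (a small C sketch; the counter functions are illustrative, not part of the notes):

```c
/* Algorithm 1: one "print" per iteration -> n operations -> O(n). */
long count_single(int n) {
    long ops = 0;
    for (int i = 1; i <= n; i++)
        ops++;                  /* stands in for: print i */
    return ops;
}

/* Algorithm 2: one "print" per (i, j) pair -> n * n operations -> O(n^2). */
long count_pairs(int n) {
    long ops = 0;
    for (int i = 1; i <= n; i++)
        for (int j = 1; j <= n; j++)
            ops++;              /* stands in for: print (i, j) */
    return ops;
}
```

Doubling n doubles `count_single` but quadruples `count_pairs`, which is exactly the linear-vs-quadratic growth described above.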
Common Time Complexities (from fastest to slowest):
πŸ“Š Time Complexity Hierarchy
Remember:
β€’ Lower time complexity = Faster algorithm
β€’ We always analyze for large input sizes (as n β†’ ∞)
β€’ We ignore constants (2n and 5n both are O(n))
4. Space Complexity
What is Space Complexity?
Space complexity is the amount of memory space required by an algorithm to run as a function of the input size. It includes both auxiliary space and space used by input.
Space Complexity = Auxiliary Space + Input Space
Example 1 - Sum of Array Elements:
int sum = 0;
for (i = 0 to n-1)
  sum = sum + arr[i]


Variables used: sum, i (only 2 variables)
Space Complexity: O(1) - constant space
(Space doesn't grow with input size)
Example 2 - Copy Array:
Create new array temp[n]
for (i = 0 to n-1)
  temp[i] = arr[i]


Extra array of size n created
Space Complexity: O(n) - linear space
(Space grows with input size)
Time vs Space Trade-off:
Sometimes we can make an algorithm faster by using more memory, or save memory by taking more time. This is called the time-space trade-off.
5. Average and Worst Case Analysis
What is Case Analysis?
Case analysis means analyzing an algorithm's performance under different scenarios. We typically analyze three cases: Best Case, Average Case, and Worst Case.
1. Best Case Analysis
Best Case: The scenario where the algorithm performs the minimum number of operations. This is the most optimistic situation.
Example - Linear Search:
Task: Find element 'x' in array [10, 20, 30, 40, 50]
If x = 10 (first element), we find it in just 1 comparison.
Best Case Time: O(1) - constant time
2. Average Case Analysis
Average Case: The expected performance when considering all possible inputs. We calculate the average number of operations across all scenarios.
Example - Linear Search:
Array: [10, 20, 30, 40, 50]
Element could be at any position (or not present)
Average comparisons = (1 + 2 + 3 + 4 + 5) / 5 = 3, i.e. (n + 1)/2 for an array of size n
Average Case Time: O(n) - linear time
3. Worst Case Analysis
Worst Case: The scenario where the algorithm performs the maximum number of operations. This is the most pessimistic situation and most commonly used for analysis.
Example - Linear Search:
Array: [10, 20, 30, 40, 50]
If x = 50 (last element) or x is not present, we check all n elements.
Worst Case Time: O(n) - linear time
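All three cases of linear search can be observed with one instrumented function (a C sketch; the comparison counter is added here for illustration):

```c
/* Linear search that also reports how many comparisons it made.
   Returns the index of x, or -1 if x is not present. */
int linear_search(const int *arr, int n, int x, int *comparisons) {
    *comparisons = 0;
    for (int i = 0; i < n; i++) {
        (*comparisons)++;
        if (arr[i] == x)
            return i;       /* best case: found at i = 0 after 1 comparison */
    }
    return -1;              /* worst case: all n elements checked */
}
```

On [10, 20, 30, 40, 50], searching for 10 costs 1 comparison (best case) while searching for a missing value costs all 5 (worst case).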
πŸ“Š Complete Asymptotic Notations Comparison
Why Worst Case is Important:
β€’ Guarantees algorithm won't perform worse than this
β€’ Useful for critical systems (medical, aviation, banking)
β€’ Provides upper bound on running time
β€’ Most commonly used in analysis
Detailed Example - Insertion Sort:

Best Case: Array is already sorted [1, 2, 3, 4, 5]
β†’ Only n-1 comparisons needed β†’ O(n)

Average Case: Array is randomly ordered
β†’ Average of nΒ²/4 comparisons β†’ O(nΒ²)

Worst Case: Array is reverse sorted [5, 4, 3, 2, 1]
β†’ Maximum comparisons: 1+2+3+...+(n-1) = n(n-1)/2 β†’ O(nΒ²)
6. Asymptotic Notations
What are Asymptotic Notations?
Asymptotic notations are mathematical tools used to describe the running time of an algorithm when the input size approaches infinity. They help us compare algorithms by ignoring constant factors and focusing on growth rate.

Why "Asymptotic"? We analyze how the algorithm behaves as input size (n) becomes very large (tends to infinity).

Three Main Asymptotic Notations:
1. Big O (O) - Upper Bound (at most this growth; usually quoted for the worst case)
2. Big Omega (Ξ©) - Lower Bound (at least this growth; usually quoted for the best case)
3. Big Theta (Θ) - Tight Bound (exact growth rate; applies when upper and lower bounds match)
7. Big O Notation (Upper Bound)
Big O Notation:
Big O describes the upper bound of an algorithm's running time. It gives the worst-case scenario - the maximum time an algorithm could take.

Think of it as: "The algorithm will take AT MOST this much time"
f(n) = O(g(n)) If there exist positive constants c and nβ‚€ such that: f(n) ≀ c Γ— g(n) for all n β‰₯ nβ‚€
Simple Explanation:
If an algorithm takes 3nΒ² + 5n + 2 operations:
β€’ For large n, nΒ² term dominates
β€’ We ignore constants (3, 5, 2)
β€’ We ignore lower order terms (5n, 2)
β€’ Big O = O(nΒ²)

This means the algorithm won't take more than nΒ² time (ignoring constants).
πŸ“Š Big O, Omega, Theta Comparison
Real Code Example:
for (i = 0; i < n; i++) {
  for (j = 0; j < n; j++) {
    print(i, j);
  }
}


Outer loop: n times
Inner loop: n times for each outer iteration
Total operations: n Γ— n = nΒ²
Big O = O(nΒ²)
Important Properties:
β€’ O(n) + O(n) = O(n) - we keep the higher order
β€’ O(n) Γ— O(n) = O(nΒ²) - we multiply
β€’ O(2n) = O(n) - constants are dropped
β€’ O(nΒ² + n) = O(nΒ²) - lower order terms dropped
8. Big Omega Notation (Ξ©) - Lower Bound
Big Omega Notation:
Big Omega describes the lower bound of an algorithm's running time. It gives the best-case scenario - the minimum time an algorithm could take.

Think of it as: "The algorithm will take AT LEAST this much time"
f(n) = Ξ©(g(n)) If there exist positive constants c and nβ‚€ such that: f(n) β‰₯ c Γ— g(n) for all n β‰₯ nβ‚€
Example - Linear Search:
Best case: Element found at first position
Time taken: 1 comparison (constant)
Ξ©(1) - Will take at least constant time

Even in the best case, we must do at least 1 comparison, so Ξ©(1).
Example - Merge Sort:
Even in the best case (already sorted array), merge sort has to:
β€’ Divide the array: log n levels
β€’ Merge at each level: n operations
Total: n log n operations
Ξ©(n log n) - Will take at least n log n time
Remember:
β€’ Big O = Upper bound (worst case - maximum time)
β€’ Big Omega = Lower bound (best case - minimum time)
β€’ Big O is more commonly used in practice
9. Big Theta Notation (Θ) - Tight Bound
Big Theta Notation:
Big Theta describes the tight bound of an algorithm's running time. It means the algorithm's running time is bounded both from above and below by the same function.

Think of it as: "The algorithm will take EXACTLY this much time (within constant factors)"
f(n) = Θ(g(n)) if and only if f(n) = O(g(n)) AND f(n) = Ξ©(g(n))
Simple Explanation:
If an algorithm has:
β€’ Worst case = O(nΒ²)
β€’ Best case = Ξ©(nΒ²)
Then we can say it's Θ(nΒ²) (tight bound)

This means the algorithm always takes around nΒ² time, regardless of input.
Example - Merge Sort:
Best case: Ξ©(n log n)
Worst case: O(n log n)
Since both are same: Θ(n log n)

Merge sort always takes n log n time, whether array is sorted, reverse sorted, or random.
Example - Simple Loop:
for (i = 0; i < n; i++) {
  print(i);
}


This loop ALWAYS runs exactly n times.
Best case = n, Worst case = n
Θ(n) - tight bound
πŸ“Š Examples of Big O Notation
10. Asymptotic Notations - Detailed Comparison
πŸ“Š Case Analysis Summary for Common Algorithms
Practical Example - Binary Search:

Best Case (Ξ©):
Element found in middle on first try
Ξ©(1) - at least constant time

Worst Case (O):
Element at end or not present
Need to divide array log n times
O(log n) - at most logarithmic time

Average Case (Θ):
On average, takes log n comparisons
Θ(log n) - exactly logarithmic time
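The binary search analysed above looks like this in C (a standard iterative sketch on a sorted array):

```c
/* Binary search: halve the search range each step -> O(log n) worst case.
   Returns the index of x in sorted array arr, or -1 if absent. */
int binary_search(const int *arr, int n, int x) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   /* avoids overflow of (lo + hi) */
        if (arr[mid] == x)
            return mid;                 /* best case: hit on the first midpoint */
        if (arr[mid] < x)
            lo = mid + 1;               /* discard left half */
        else
            hi = mid - 1;               /* discard right half */
    }
    return -1;                          /* worst case: log n halvings, not found */
}
```

Each loop iteration discards half the remaining range, which is where the log n bound comes from.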
When to Use Which:
β€’ Use Big O when analyzing worst-case (most common)
β€’ Use Big Omega when proving lower bounds
β€’ Use Big Theta when best and worst cases are same
β€’ In practice, "Big O" is used most frequently
11. Common Mistakes in Complexity Analysis
⚠️ Mistake 1: Not Dropping Constants
❌ Wrong: O(2n) or O(5n²)
βœ“ Correct: O(n) and O(nΒ²)
Constants are always dropped!

⚠️ Mistake 2: Not Dropping Lower Order Terms
❌ Wrong: O(n² + n)
βœ“ Correct: O(nΒ²)
Keep only the highest order term!

⚠️ Mistake 3: Confusing Best/Worst Case with Big O/Omega
β€’ Best case can still have Big O notation
β€’ Worst case can still have Big Omega notation
β€’ They are different concepts!

⚠️ Mistake 4: Thinking O(n) is Always Better than O(n²)
For small n, O(nΒ²) with small constant might be faster
Example: O(nΒ²) algorithm with 2nΒ² operations vs O(n) with 1000n operations
For n < 500, the O(nΒ²) is actually faster!
12. Practice: Finding Time Complexity
Example 1: Simple Loop
for (i = 0; i < n; i++) {
  print(i);
}

Loop runs n times β†’ O(n)
Example 2: Nested Loops (Same Limit)
for (i = 0; i < n; i++) {
  for (j = 0; j < n; j++) {
    print(i, j);
  }
}

Outer: n times, Inner: n times β†’ n Γ— n β†’ O(nΒ²)
Example 3: Nested Loops (Different Limits)
for (i = 0; i < n; i++) {
  for (j = 0; j < m; j++) {
    print(i, j);
  }
}

Outer: n times, Inner: m times β†’ n Γ— m β†’ O(nΓ—m)
Example 4: Loop with Division
for (i = n; i > 0; i = i/2) {
  print(i);
}

Each iteration divides by 2
n β†’ n/2 β†’ n/4 β†’ ... β†’ 1
Takes logβ‚‚(n) iterations β†’ O(log n)
Example 5: Sequential Code
for (i = 0; i < n; i++) {
  print(i);
}
for (j = 0; j < n; j++) {
  print(j);
}

First loop: O(n), Second loop: O(n)
Total: O(n) + O(n) = O(2n) = O(n)
Example 6: Dependent Nested Loop
for (i = 0; i < n; i++) {
  for (j = 0; j < i; j++) {
    print(i, j);
  }
}

When i=0: 0 times, i=1: 1 time, i=2: 2 times, ..., i=n-1: n-1 times
Total: 0+1+2+...+(n-1) = n(n-1)/2 β†’ O(nΒ²)
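The sum 0+1+...+(n-1) = n(n-1)/2 from Example 6 can be checked by actually counting iterations (a small C sketch for verification):

```c
/* Count iterations of the dependent nested loop from Example 6
   and compare against the closed form n(n-1)/2. */
long dependent_loop_count(int n) {
    long ops = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < i; j++)   /* inner loop runs i times */
            ops++;
    return ops;
}
```

For n = 10 this yields 45 = 10Γ—9/2, confirming the O(nΒ²) total.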
πŸ” Chapter 2 β€” Recurrence Equations & Solutions
πŸ“Š RECURRENCE - MIND MAP
Recurrence β†’ Function defined using smaller inputs
Substitution β†’ Guess + Prove by induction
Recursion Tree β†’ Draw tree, sum level costs
Master Method β†’ T(n)=aT(n/b)+f(n) β†’ 3 cases
1. Recurrence Relations
Definition: An equation that defines a function in terms of itself with smaller inputs.
Common Examples:
Binary Search: T(n) = T(n/2) + c
Merge Sort: T(n) = 2T(n/2) + cn
Factorial: T(n) = T(n-1) + c
2. Substitution Method
Step 1: Guess solution
Step 2: Prove by induction
Example: T(n) = 2T(n/2) + n
Guess: T(n) = cn log n
Proof: T(n) = 2[c(n/2)log(n/2)] + n = cn log n - cn + n ≀ cn log n (for c β‰₯ 1) βœ“
Result: Θ(n log n)
3. Recursion Tree Method
T(n) = 2T(n/2) + n
Level 0: n β†’ n
Level 1: n/2, n/2 β†’ n
Level 2: 4Γ—(n/4) β†’ n
Height: log n
Total: n Γ— log n
Result: Θ(n log n)
4. Master Method
T(n) = aT(n/b) + f(n), where a β‰₯ 1, b > 1
Compare f(n) with n^(log_b a):
Case 1: f(n) grows slower than n^(log_b a) β†’ T(n) = Θ(n^(log_b a))
Case 2: f(n) = Θ(n^(log_b a)) β†’ T(n) = Θ(n^(log_b a) Γ— log n)
Case 3: f(n) grows faster than n^(log_b a) β†’ T(n) = Θ(f(n))
Examples:
T(n) = 2T(n/2) + n β†’ Case 2 β†’ Θ(n log n)
T(n) = T(n/2) + 1 β†’ Case 2 β†’ Θ(log n)
T(n) = 2T(n/2) + nΒ² β†’ Case 3 β†’ Θ(nΒ²)
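A recurrence can also be evaluated numerically to sanity-check the closed form. For T(n) = 2T(n/2) + n with T(1) = 1 and n a power of two, the exact solution is n logβ‚‚n + n, matching the Θ(n log n) result above (a small C sketch; the base case T(1) = 1 is an assumption made here):

```c
/* Evaluate T(n) = 2T(n/2) + n with T(1) = 1, n a power of two.
   Closed form: T(n) = n*log2(n) + n. */
long T(long n) {
    if (n <= 1)
        return 1;               /* assumed base case */
    return 2 * T(n / 2) + n;    /* the recurrence itself */
}
```

For n = 8: T(8) = 8Γ—3 + 8 = 32, exactly what the recursion tree predicts.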
πŸ“¦ Chapter 3 β€” Review of Data Structures
πŸ“Š DATA STRUCTURES - MIND MAP
Linear β†’ Arrays, Stacks, Queues, Linked Lists
Arrays β†’ Fixed size, contiguous memory, O(1) access
Stack β†’ LIFO (Last In First Out), Push/Pop operations
Queue β†’ FIFO (First In First Out), Enqueue/Dequeue
Linked List β†’ Dynamic, nodes with pointers (Singly, Doubly, Circular)
Hashing β†’ Key-value mapping, O(1) average search
1. Arrays
What is an Array?
An array is a collection of elements of the same data type stored in contiguous memory locations. Each element can be accessed using an index.
Real-life Example:
Think of a row of lockers in school. Each locker has a number (index) and stores items (data). You can directly go to locker #5 without checking lockers 1-4.
Characteristics:
βœ“ Fixed Size: Size must be declared at creation
βœ“ Random Access: Access any element in O(1) time
βœ“ Contiguous Memory: Elements stored in consecutive locations
βœ“ Same Data Type: All elements must be of same type
Time Complexity:
Operation          | Time Complexity
------------------ | ---------------
Access (by index)  | O(1)
Search             | O(n)
Insert (at end)    | O(1)
Insert (at middle) | O(n)
Delete             | O(n)
Key Points:
β€’ Index starts from 0
β€’ arr[i] gives element at index i
β€’ Memory address: base_address + (i Γ— size_of_element)
2. Stacks
What is a Stack?
A stack is a linear data structure that follows the LIFO (Last In First Out) principle. The last element inserted is the first one to be removed.
Real-life Examples:
β€’ Stack of plates: You add/remove from the top only
β€’ Undo operation in editors: Last action is undone first
β€’ Browser back button: Last page visited is shown first
Basic Operations:
1. Push: Add element to top of stack
2. Pop: Remove element from top of stack
3. Peek/Top: View top element without removing
4. isEmpty: Check if stack is empty
5. isFull: Check if stack is full (for fixed-size arrays)
Stack Operations - Pseudocode:
Push(item):
  if stack is full:
    return "Stack Overflow"
  else:
    top = top + 1
    stack[top] = item

Pop():
  if stack is empty:
    return "Stack Underflow"
  else:
    item = stack[top]
    top = top - 1
    return item

Peek():
  if stack is empty:
    return "Stack is empty"
  else:
    return stack[top]
Time Complexity:
Operation | Time
--------- | ----
Push      | O(1)
Pop       | O(1)
Peek      | O(1)
Search    | O(n)
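The pseudocode maps directly onto a fixed-size array stack in C (a minimal sketch; `MAX` and the struct layout are choices made here, not part of the notes):

```c
#define MAX 100   /* capacity for this sketch */

/* Array-based stack: top == -1 means empty. */
typedef struct { int data[MAX]; int top; } Stack;

void stack_init(Stack *s) { s->top = -1; }
int  stack_is_empty(const Stack *s) { return s->top == -1; }

/* Push: returns 0 on overflow, 1 on success. O(1). */
int stack_push(Stack *s, int item) {
    if (s->top == MAX - 1) return 0;
    s->data[++s->top] = item;
    return 1;
}

/* Pop: returns 0 on underflow; popped value goes into *item. O(1). */
int stack_pop(Stack *s, int *item) {
    if (stack_is_empty(s)) return 0;
    *item = s->data[s->top--];
    return 1;
}
```

Pushing 1, 2, 3 and then popping twice yields 3 then 2, demonstrating LIFO order.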
Applications of Stack:
β€’ Function call management (recursion)
β€’ Expression evaluation (infix to postfix)
β€’ Backtracking algorithms
β€’ Undo/Redo operations
β€’ Syntax parsing
3. Queues
What is a Queue?
A queue is a linear data structure that follows the FIFO (First In First Out) principle. The first element inserted is the first one to be removed.
Real-life Examples:
β€’ Line at ticket counter: First person in line is served first
β€’ Print queue: First document sent is printed first
β€’ CPU scheduling: Processes wait in queue
Basic Operations:
1. Enqueue: Add element to rear (back) of queue
2. Dequeue: Remove element from front of queue
3. Front: View front element without removing
4. Rear: View last element
5. isEmpty: Check if queue is empty
Queue Operations - Pseudocode:
Enqueue(item):
  if queue is full:
    return "Queue Overflow"
  else:
    rear = rear + 1
    queue[rear] = item

Dequeue():
  if queue is empty:
    return "Queue Underflow"
  else:
    item = queue[front]
    front = front + 1
    return item
Time Complexity:
Operation | Time
--------- | ----
Enqueue   | O(1)
Dequeue   | O(1)
Front     | O(1)
Search    | O(n)
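The same pseudocode in C (a minimal sketch; note this plain version never reuses dequeued slots, which a circular queue would fix):

```c
#define QMAX 100   /* capacity for this sketch */

/* Array-based queue: front chases rear; empty when front > rear. */
typedef struct { int data[QMAX]; int front, rear; } Queue;

void queue_init(Queue *q) { q->front = 0; q->rear = -1; }
int  queue_is_empty(const Queue *q) { return q->front > q->rear; }

/* Enqueue at the rear: returns 0 on overflow. O(1). */
int enqueue(Queue *q, int item) {
    if (q->rear == QMAX - 1) return 0;
    q->data[++q->rear] = item;
    return 1;
}

/* Dequeue from the front: returns 0 on underflow. O(1). */
int dequeue(Queue *q, int *item) {
    if (queue_is_empty(q)) return 0;
    *item = q->data[q->front++];
    return 1;
}
```

Enqueuing 1, 2 and then dequeuing twice yields 1 then 2, demonstrating FIFO order.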
Applications of Queue:
β€’ CPU scheduling
β€’ Printer spooling
β€’ BFS (Breadth-First Search) in graphs
β€’ Handling requests in web servers
β€’ Call center systems
4. Pointers
What is a Pointer?
A pointer is a variable that stores the memory address of another variable. It "points to" the location where data is stored.
Real-life Example:
Think of a pointer as a sticky note with an address written on it. The note doesn't contain your friend's house, but it tells you where to find it.
Pointer Basics:
int x = 10;        // Normal variable
int *ptr = &x;     // Pointer storing address of x

// &x gives address of x
// *ptr gives value at address (dereferencing)

printf("%d", x);     // Prints 10
printf("%d", *ptr);  // Prints 10 (value at address)
printf("%p", ptr);   // Prints address
Why Pointers?
β€’ Dynamic memory allocation
β€’ Implement linked lists, trees, graphs
β€’ Pass large structures efficiently
β€’ Enable call by reference
5. Linked Lists
What is a Linked List?
A linked list is a linear data structure where elements (nodes) are connected using pointers. Each node contains data and a pointer to the next node.
Real-life Example:
Think of a treasure hunt where each clue points to the next location. You follow the chain of clues to reach the treasure.
Node Structure:
Basic Node:
struct Node {
  int data;           // Data part
  struct Node* next;  // Pointer to next node
};
Types of Linked Lists:
1. Singly Linked List (One-way)
Each node points to the next node only. Traversal possible in one direction (forward only).
Singly Linked List Structure:
[Data|Next] β†’ [Data|Next] β†’ [Data|Next] β†’ NULL
Example: 10 β†’ 20 β†’ 30 β†’ NULL

Operations:
- Insert at beginning: O(1)
- Insert at end: O(n)
- Delete: O(n)
- Search: O(n)
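The O(1) insert-at-beginning and O(n) search are easy to see in code (a minimal C sketch using the node struct above; error handling for malloc is omitted for brevity):

```c
#include <stdlib.h>

struct Node { int data; struct Node *next; };

/* Insert at the beginning: allocate, link to old head, return new head. O(1). */
struct Node *push_front(struct Node *head, int value) {
    struct Node *node = malloc(sizeof *node);
    node->data = value;
    node->next = head;
    return node;
}

/* Search: must walk the chain from the head. O(n). */
int contains(const struct Node *head, int value) {
    for (; head != NULL; head = head->next)
        if (head->data == value)
            return 1;
    return 0;
}
```

Pushing 30, 20, 10 in that order builds the list 10 β†’ 20 β†’ 30 β†’ NULL from the example.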
2. Doubly Linked List (Two-way)
Each node has two pointers: one to the next node and one to the previous node. Traversal possible in both directions.
Doubly Linked List Structure:
NULL ← [Prev|Data|Next] ↔ [Prev|Data|Next] ↔ [Prev|Data|Next] β†’ NULL
Example: NULL ← 10 ↔ 20 ↔ 30 β†’ NULL

Node Structure:
struct Node {
  int data;
  struct Node* prev;
  struct Node* next;
};

Advantages:
- Backward traversal possible
- Delete operation easier (no need to track previous node)

Disadvantages:
- Extra memory for prev pointer
- More complex insertion/deletion
3. Circular Linked List
Last node points back to the first node, forming a circle. Can be singly or doubly linked.
πŸ“Š Circular Linked List - Singly & Doubly
πŸ“Š Comparison: Singly vs Doubly vs Circular Linked Lists
Linked List vs Array:

Linked List Advantages:
β€’ Dynamic size (grow/shrink as needed)
β€’ Easy insertion/deletion (just change pointers)
β€’ No memory waste

Linked List Disadvantages:
β€’ No random access (must traverse from head)
β€’ Extra memory for pointers
β€’ Cache unfriendly (not contiguous memory)
6. Hashing
What is Hashing?
Hashing is a technique to map data of any size to fixed-size values using a hash function. It enables fast data retrieval using keys.

Hash Function: Function that converts input (key) into a hash value (index)
Hash Table: Data structure that stores key-value pairs using hashing
Collision: When two keys map to the same index
Hash Function Methods:
1. Division Method:
h(k) = k mod m

where: k = key, m = table size

Example: key=25, m=10 β†’ h(25) = 25 % 10 = 5
2. Multiplication Method:
h(k) = ⌊m Γ— (k Γ— A mod 1)βŒ‹

where: A β‰ˆ 0.618 (golden ratio), m = table size

Example: k=123, A=0.618, m=100
h(123) = ⌊100 Γ— (123Γ—0.618 mod 1)βŒ‹ = 1
3. Mid-Square Method:
Steps: Square key β†’ Extract middle digits β†’ Use as hash

Example: key=60, table size=100
60Β² = 3600 β†’ Middle digits = 60 β†’ h(60) = 60
4. Folding Method:
Steps: Divide key into parts β†’ Add parts β†’ Apply mod

Example: key=123456, m=100
Parts: 12 + 34 + 56 = 102
h(123456) = 102 % 100 = 2
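The division and folding methods above are one-liners in C (a minimal sketch; splitting the key into two-digit parts matches the worked example):

```c
/* Division method: h(k) = k mod m. */
unsigned division_hash(unsigned k, unsigned m) {
    return k % m;
}

/* Folding method: split the key into two-digit parts from the right,
   add the parts, then take mod m. */
unsigned folding_hash(unsigned k, unsigned m) {
    unsigned sum = 0;
    while (k > 0) {
        sum += k % 100;   /* last two digits form one part */
        k /= 100;
    }
    return sum % m;
}
```

These reproduce the worked examples: h(25) = 5 with m = 10, and for key 123456 the parts 56 + 34 + 12 = 102 give h = 2 with m = 100.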
Collision Resolution Techniques:
A. Open Addressing (Closed Hashing):
  Linear Probing: h(k,i) = (h(k) + i) mod m
  Quadratic Probing: h(k,i) = (h(k) + c₁i + cβ‚‚iΒ²) mod m
  Double Hashing: h(k,i) = (h₁(k) + iΓ—hβ‚‚(k)) mod m

B. Chaining (Open Hashing):
  Each slot points to a linked list of colliding elements
πŸ’‘ Good Hash Function Properties:
β€’ Uniform distribution of keys
β€’ Fast computation
β€’ Minimize collisions
β€’ Use all hash table positions
Applications of Hashing:
β€’ Database indexing
β€’ Caching mechanisms
β€’ Password storage
β€’ Compiler symbol tables
β€’ Cryptographic hash functions
7. Trees
What is a Tree?
A tree is a hierarchical data structure consisting of nodes connected by edges. It has a root node and every other node is connected by exactly one path from the root.
Real-life Examples:
β€’ Family tree (genealogy)
β€’ Organization chart (company hierarchy)
β€’ File system (folders and files)
β€’ Decision trees
Tree Terminology:
Root: Top node (no parent)
Parent: Node with children
Child: Node connected below parent
Leaf: Node with no children
Height: Longest path from root to leaf
Depth: Distance from root to node
Degree: Number of children of a node
Binary Search Tree (BST)
What is BST?
A Binary Search Tree is a binary tree where:
β€’ Left subtree contains values less than parent
β€’ Right subtree contains values greater than parent
β€’ Both left and right subtrees are also BSTs
🌳 Binary Search Tree Example with Properties
πŸ“Š BST Operations - Time Complexity Analysis
BST Search Example:
Search for 60 in the tree above:
1. Start at root (50): 60 > 50, go right
2. At 70: 60 < 70, go left
3. Found 60! β†’ 2 comparisons

Much faster than linear search (would need 5 comparisons)
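The search walk described above (50 β†’ 70 β†’ 60) follows one root-to-leaf path. A minimal C sketch of BST insert and search (malloc error handling omitted for brevity):

```c
#include <stdlib.h>

struct TNode { int key; struct TNode *left, *right; };

/* Insert: smaller keys go left, larger go right; duplicates are ignored. */
struct TNode *bst_insert(struct TNode *root, int key) {
    if (root == NULL) {
        struct TNode *n = malloc(sizeof *n);
        n->key = key;
        n->left = n->right = NULL;
        return n;
    }
    if (key < root->key)      root->left  = bst_insert(root->left, key);
    else if (key > root->key) root->right = bst_insert(root->right, key);
    return root;
}

/* Search: one comparison per level -> O(h), O(log n) when balanced. */
int bst_search(const struct TNode *root, int key) {
    while (root != NULL) {
        if (key == root->key) return 1;
        root = (key < root->key) ? root->left : root->right;
    }
    return 0;
}
```

Inserting 50, 30, 70, 60 builds a tree where searching for 60 takes the 50 β†’ 70 β†’ 60 path from the example.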
B-Tree
What is B-Tree?
A B-Tree is a self-balancing tree where each node can have multiple keys and children. It's optimized for systems that read/write large blocks of data (like databases).
B-Tree Properties (order m):
β€’ Each node has at most m children
β€’ Each node (except root) has at least ⌈m/2βŒ‰ children
β€’ Root has at least 2 children (unless it's a leaf)
β€’ All leaves are at the same level
β€’ Keys in a node are sorted
Why B-Trees?
β€’ Used in databases and file systems
β€’ Minimize disk I/O operations
β€’ Keep tree height low (balanced)
β€’ Efficient for large datasets

Applications: MySQL databases, file systems (NTFS, ext4)
Balanced Trees (AVL & Red-Black Trees)
What are Balanced Trees?
Balanced trees are self-balancing BSTs that maintain O(log n) height, ensuring efficient operations even in worst case.
AVL Tree:
A self-balancing BST where the height difference between left and right subtrees (balance factor) is at most 1 for every node.

Balance Factor: height(left subtree) - height(right subtree)
Allowed values: -1, 0, 1
AVL Tree Operations:
When a node's balance factor becomes > 1 or < -1, perform rotations to rebalance:
1. LL imbalance β†’ Right Rotation
2. RR imbalance β†’ Left Rotation
3. LR imbalance β†’ Left-Right Rotation
4. RL imbalance β†’ Right-Left Rotation

Time Complexity:
Search: O(log n) - GUARANTEED
Insert: O(log n)
Delete: O(log n)
Red-Black Tree:
A self-balancing BST where each node has a color (red or black) with specific properties maintained.
Red-Black Tree Properties:
1. Every node is either red or black
2. Root is always black
3. All leaves (NULL) are black
4. Red node cannot have red children
5. All paths from root to leaves have same number of black nodes
βš–οΈ AVL Tree vs Red-Black Tree - Complete Comparison
When to Use Which Tree?

BST: Simple applications, small datasets
AVL: Frequent searches, read-heavy operations
Red-Black: Frequent insertions/deletions
B-Tree: Large datasets, databases, file systems
8. Heaps
What is a Heap?
A heap is a complete binary tree that satisfies the heap property. It's primarily used for priority queues and heap sort.
Types of Heaps:
1. Max Heap:
Parent node is always greater than or equal to its children.
Root contains the maximum element.
Max Heap Example:
        100
       /   \
     80     90
    /  \   /  \
  70   60 50   40

Property: Parent β‰₯ Children
Root (100) is the maximum element
2. Min Heap:
Parent node is always less than or equal to its children.
Root contains the minimum element.
Min Heap Example:
        10
       /  \
     20    30
    /  \  /  \
  40  50 60  70

Property: Parent ≀ Children
Root (10) is the minimum element
Heap Operations:
1. Insert:
β€’ Add element at the end (bottom-right)
β€’ Heapify up (swap with parent if needed)
Time: O(log n)

2. Delete (Extract Max/Min):
β€’ Remove root element
β€’ Replace root with last element
β€’ Heapify down (swap with larger/smaller child)
Time: O(log n)

3. Get Max/Min:
β€’ Simply return root
Time: O(1)
Heap Implementation:
Heaps are usually stored in arrays. For node at index i:
- Left child: 2i + 1
- Right child: 2i + 2
- Parent: (i - 1) / 2

Example (Max Heap):
Array: [100, 80, 90, 70, 60, 50, 40]
Index:    0   1   2   3   4   5   6

          100(0)
         /      \
     80(1)      90(2)
     /   \      /   \
 70(3) 60(4) 50(5) 40(6)
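The array layout plus the insert-and-heapify-up operation fit in a few lines of C (a minimal max-heap sketch; `HMAX` and the struct are choices made here):

```c
#define HMAX 100   /* capacity for this sketch */

/* Max-heap stored in an array, using the index formulas above. */
typedef struct { int a[HMAX]; int size; } MaxHeap;

/* Insert at the end, then heapify up: swap with parent (i-1)/2
   while the new value is larger. O(log n). */
void heap_insert(MaxHeap *h, int value) {
    int i = h->size++;
    h->a[i] = value;
    while (i > 0 && h->a[(i - 1) / 2] < h->a[i]) {
        int p = (i - 1) / 2;
        int tmp = h->a[p]; h->a[p] = h->a[i]; h->a[i] = tmp;
        i = p;
    }
}

/* Get Max: the root is always the maximum. O(1). */
int heap_max(const MaxHeap *h) { return h->a[0]; }
```

Inserting 70, 100, 80, 90 in any order leaves 100 at the root, since each insert bubbles larger values upward.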
Applications of Heaps:
β€’ Priority Queue implementation
β€’ Heap Sort algorithm
β€’ Dijkstra's shortest path algorithm
β€’ Finding k largest/smallest elements
β€’ Median maintenance in streaming data
Time Complexity Summary:
Operation      | Time Complexity
-------------- | ---------------
Build Heap     | O(n)
Insert         | O(log n)
Delete Max/Min | O(log n)
Get Max/Min    | O(1)
Heapify        | O(log n)
9. Graphs
What is a Graph?
A graph is a non-linear data structure consisting of vertices (nodes) and edges (connections). It represents relationships between objects.
Real-life Examples:
β€’ Social networks (Facebook friends)
β€’ Road networks (cities connected by roads)
β€’ Internet (web pages connected by links)
β€’ Flight routes (airports and flights)
Graph Terminology:
Vertex (Node): A point in the graph
Edge: Connection between two vertices
Adjacent: Two vertices connected by an edge
Degree: Number of edges connected to a vertex
Path: Sequence of vertices connected by edges
Cycle: Path that starts and ends at same vertex
Types of Graphs:
1. Directed Graph (Digraph):
Edges have direction (one-way). A→B is different from B→A.
Directed Graph Example:
A β†’ B
↓   ↓
C ← D

Edge (A,B) means you can go from A to B
but not from B to A (unless there's a separate Bβ†’A edge)
2. Undirected Graph:
Edges have no direction (two-way). A-B means you can go both ways.
Undirected Graph Example:
A β€” B
|   |
C β€” D

Edge (A,B) allows travel in both directions
3. Weighted Graph:
Edges have weights (costs/distances).
Weighted Graph Example:
    5       3
A ----- B ----- C
 \             /
  10          7
   \         /
    +-- D --+

Edges: A-B (5), B-C (3), A-D (10), C-D (7)
Used for: Shortest path problems, network routing
Graph Representation:
1. Adjacency Matrix:
2D array where matrix[i][j] = 1 if edge exists between vertex i and j.
Adjacency Matrix Example:
Graph:
0 β€” 1
|   |
2 β€” 3

Matrix:
     0  1  2  3
0 [  0  1  1  0 ]
1 [  1  0  0  1 ]
2 [  1  0  0  1 ]
3 [  0  1  1  0 ]

Space: O(VΒ²)
Check edge: O(1)
Find neighbors: O(V)
2. Adjacency List:
Array of lists. Each vertex has a list of its adjacent vertices.
Adjacency List Example:
Graph:
0 β€” 1
|   |
2 β€” 3

List:
0: [1, 2]
1: [0, 3]
2: [0, 3]
3: [1, 2]

Space: O(V + E)
Check edge: O(degree)
Find neighbors: O(degree)
Better for sparse graphs!
Adjacency Matrix vs List:
Feature        | Matrix | List
-------------- | ------ | ---------
Space          | O(VΒ²)  | O(V + E)
Add edge       | O(1)   | O(1)
Remove edge    | O(1)   | O(V)
Check edge     | O(1)   | O(degree)
Find neighbors | O(V)   | O(degree)

Use Matrix: Dense graphs, frequent edge checks
Use List: Sparse graphs, less memory
Graph Traversal:
1. BFS (Breadth-First Search):
Visit all neighbors at current level before moving to next level. Uses Queue.
BFS Algorithm:
1. Start from source vertex
2. Mark it visited, add to queue
3. While queue not empty:
   - Remove vertex from queue
   - Visit all unvisited neighbors
   - Mark them visited, add to queue

Time: O(V + E)
Space: O(V)
Applications: Shortest path in unweighted graphs, level-order traversal
2. DFS (Depth-First Search):
Go as deep as possible along each branch before backtracking. Uses Stack/Recursion.
DFS Algorithm:
1. Start from source vertex
2. Mark it visited
3. Recursively visit all unvisited neighbors

Time: O(V + E)
Space: O(V)
Applications: Cycle detection, topological sorting, maze solving
BFS vs DFS:

Use BFS when:
β€’ Finding shortest path
β€’ Level-by-level exploration needed
β€’ Solution is near source

Use DFS when:
β€’ Need to visit all nodes
β€’ Detecting cycles
β€’ Topological sorting
β€’ Solution is far from source
Common Graph Applications:
β€’ Google Maps (shortest path - Dijkstra's algorithm)
β€’ Social Network Analysis (friend recommendations)
β€’ Web Crawling (DFS to explore websites)
β€’ Dependency Resolution (topological sort)
β€’ Network Flow (maximum flow problems)
πŸ”’ Chapter 4 β€” Sorting in Linear Time
1. Linear Time Sorting Algorithms
What is Linear Time Sorting?
Sorting algorithms that can sort in O(n) or O(n+k) time, faster than comparison-based sorts (O(n log n) lower bound).
Key Difference:
β€’ Comparison sorts: Compare elements (Merge Sort, Quick Sort) β†’ O(n log n)
β€’ Non-comparison sorts: Use different approach (Counting, Radix, Bucket) β†’ O(n)
2. Counting Sort
What is Counting Sort?
Counting Sort counts the occurrences of each element and uses this information to place elements in sorted order. Works well when range of input is small.
Algorithm Steps:
1. Find the range (max - min)
2. Create count array of size (max + 1)
3. Count occurrences of each element
4. Modify count array to store cumulative sums
5. Place elements in output array using count array
6. Copy output to original array

Time Complexity: O(n + k) where k is the range
Space Complexity: O(k)
Example: Sort [4, 2, 2, 8, 3, 3, 1]

Step 1: Count occurrences
Count array: [0, 1, 2, 2, 1, 0, 0, 0, 1]
Index: [0, 1, 2, 3, 4, 5, 6, 7, 8]

Step 2: Cumulative sum
Count array: [0, 1, 3, 5, 6, 6, 6, 6, 7]

Step 3: Place elements
Output: [1, 2, 2, 3, 3, 4, 8]

Result: Sorted in O(n + k) time!
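The three steps of the worked example map onto this C sketch (for non-negative integers in [0, k]; the back-to-front placement loop is what makes the sort stable):

```c
#include <string.h>

/* Counting sort: stable, O(n + k) time, O(k) extra space.
   in  - unsorted values in the range [0, k]
   out - receives the sorted result (length n) */
void counting_sort(const int *in, int *out, int n, int k) {
    int count[k + 1];
    memset(count, 0, sizeof count);
    for (int i = 0; i < n; i++)            /* Step 1: count occurrences */
        count[in[i]]++;
    for (int v = 1; v <= k; v++)           /* Step 2: cumulative sums */
        count[v] += count[v - 1];
    for (int i = n - 1; i >= 0; i--)       /* Step 3: place elements; */
        out[--count[in[i]]] = in[i];       /* back-to-front keeps equal keys stable */
}
```

Running it on [4, 2, 2, 8, 3, 3, 1] with k = 8 reproduces the count and cumulative arrays shown above and yields [1, 2, 2, 3, 3, 4, 8].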
Advantages:
βœ“ Linear time O(n + k)
βœ“ Stable sorting (maintains relative order)
βœ“ Simple to implement

Disadvantages:
βœ— Not suitable for large range (k >> n)
βœ— Requires extra space O(k)
βœ— Only works with integers
When to Use Counting Sort?
β€’ Small range of integers
β€’ Need stable sort
β€’ Example: Sorting ages (0-100), grades (0-100)
3. Radix Sort
What is Radix Sort?
Radix Sort sorts numbers digit by digit, starting from least significant digit (LSD) to most significant digit (MSD). Uses Counting Sort as subroutine.
Algorithm Steps:
1. Find the maximum number to know the number of digits
2. Starting from the least significant digit:
   - Sort elements by the current digit using Counting Sort
   - Move to the next digit
3. Repeat until all digits are processed

Time Complexity: O(d Γ— (n + k)) where d = number of digits, k = range (usually 10 for decimal)
Space Complexity: O(n + k)
Example: Sort [170, 45, 75, 90, 802, 24, 2, 66]

Max = 802 (3 digits), so d = 3

Pass 1 (Ones place):
Last digits (in input order): 0, 5, 5, 0, 2, 4, 2, 6
Result: [170, 90, 802, 2, 24, 45, 75, 66]

Pass 2 (Tens place):
Sort by middle digit: 7, 9, 0, 0, 2, 4, 7, 6
Result: [802, 2, 24, 45, 66, 170, 75, 90]

Pass 3 (Hundreds place):
Sort by first digit: 8, 0, 0, 0, 0, 1, 0, 0
Final Result: [2, 24, 45, 66, 75, 90, 170, 802]
Advantages:
✓ Fast for numbers with a fixed number of digits
✓ Stable sorting
✓ Can beat comparison sorts on large datasets of fixed-width integers

Disadvantages:
✗ Slower for small datasets
✗ Requires a stable sort as a subroutine
✗ Not in-place (needs extra space)
When to Use Radix Sort?
• Large list of integers
• Fixed number of digits
• Example: Sorting phone numbers, student IDs, ZIP codes
4. Bucket Sort
What is Bucket Sort?
Bucket Sort distributes elements into several buckets, sorts each bucket individually, then concatenates all buckets. Works well for uniformly distributed data.
Algorithm Steps:
1. Create n empty buckets
2. Distribute elements into buckets based on their value range
3. Sort each bucket individually (e.g., using insertion sort)
4. Concatenate all buckets in order

Time Complexity: Average O(n + k), where k is the number of buckets; Worst O(n²) when all elements land in one bucket
Space Complexity: O(n + k)
Example: Sort [0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12]

Assume 5 buckets (range 0-1)

Step 1: Distribute to buckets
Bucket 0 [0.0-0.2): [0.17, 0.12]
Bucket 1 [0.2-0.4): [0.39, 0.26, 0.21]
Bucket 2 [0.4-0.6): []
Bucket 3 [0.6-0.8): [0.78, 0.72]
Bucket 4 [0.8-1.0]: [0.94]

Step 2: Sort each bucket
Bucket 0: [0.12, 0.17]
Bucket 1: [0.21, 0.26, 0.39]
Bucket 2: []
Bucket 3: [0.72, 0.78]
Bucket 4: [0.94]

Step 3: Concatenate
Final: [0.12, 0.17, 0.21, 0.26, 0.39, 0.72, 0.78, 0.94]
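The three steps above can be sketched in Python, assuming inputs in [0, 1); the bucket count of 5 matches the worked example:

```python
def insertion_sort(bucket):
    """Sort a single bucket in place (fine for small buckets)."""
    for i in range(1, len(bucket)):
        key, j = bucket[i], i - 1
        while j >= 0 and bucket[j] > key:
            bucket[j + 1] = bucket[j]
            j -= 1
        bucket[j + 1] = key
    return bucket

def bucket_sort(arr, n_buckets=5):
    """Bucket sort for values in [0, 1): distribute, sort, concatenate."""
    buckets = [[] for _ in range(n_buckets)]
    for x in arr:
        idx = min(int(x * n_buckets), n_buckets - 1)  # clamp in case x == 1.0
        buckets[idx].append(x)
    # sort each bucket individually, then concatenate in order
    return [x for b in buckets for x in insertion_sort(b)]

print(bucket_sort([0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12]))
# [0.12, 0.17, 0.21, 0.26, 0.39, 0.72, 0.78, 0.94]
```

The bucket index `int(x * n_buckets)` assumes a uniform value range; for a different range [lo, hi) you would first rescale x to (x - lo) / (hi - lo).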
Advantages:
✓ Fast for uniformly distributed data
✓ Simple to understand
✓ Can use any sorting algorithm for buckets

Disadvantages:
✗ Performance depends on distribution
✗ Extra space required
✗ Not efficient for large ranges
When to Use Bucket Sort?
• Data is uniformly distributed
• Floating-point numbers in a known range
• Example: Sorting decimal numbers, histogram data
5. Sorting Algorithms Comparison
Complete Sorting Comparison Table
📊 Complete Comparison of All Sorting Algorithms

Algorithm | Best | Average | Worst | Space | Stable
Counting Sort | O(n+k) | O(n+k) | O(n+k) | O(k) | Yes
Radix Sort | O(d(n+k)) | O(d(n+k)) | O(d(n+k)) | O(n+k) | Yes
Bucket Sort | O(n+k) | O(n+k) | O(n²) | O(n+k) | Yes
Merge Sort | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes
Quick Sort | O(n log n) | O(n log n) | O(n²) | O(log n) | No
Heap Sort | O(n log n) | O(n log n) | O(n log n) | O(1) | No
⚡ Last Minute Notes - Quick Revision (5-10 mins)
📌 How to Use: Read this 5-10 minutes before the exam. Contains all formulas, key concepts, and commonly asked topics in condensed form.
⏱️ Time & Space Complexity
O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(2ⁿ) < O(n!)

Remember:
• Drop constants: O(2n) = O(n)
• Keep the highest term: O(n² + n) = O(n²)
• Nested loops multiply: O(n) × O(n) = O(n²)
📊 Case Analysis
Best Case (Ω): Minimum time ("at least")
Average Case (Θ): Expected time ("exactly")
Worst Case (O): Maximum time ("at most")
Note: O, Ω, Θ are bounds, not cases; any case can be stated with any notation, but this pairing is the usual exam convention.

Most important: Worst Case
📈 Asymptotic Notations (Very Important!)
Big O (O) → Upper bound (≤)
Big Omega (Ω) → Lower bound (≥)
Big Theta (Θ) → Tight bound (=)

Master Method: T(n) = aT(n/b) + f(n)
Compute n^(log_b a) and compare it with f(n):
Case 1: f(n) grows slower than n^(log_b a) → Θ(n^(log_b a))
Case 2: f(n) grows at the same rate → Θ(n^(log_b a) × log n)
Case 3: f(n) grows faster → Θ(f(n))
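The three Master Method cases can be checked with a tiny script, restricted to the simplified polynomial form f(n) = n^d (the function name and output format here are my own, for illustration only):

```python
import math

def master_case(a, b, d):
    """Classify T(n) = a*T(n/b) + n^d via the simplified Master Theorem."""
    crit = math.log(a, b)                 # critical exponent log_b(a)
    if math.isclose(d, crit):             # Case 2: f(n) matches n^(log_b a)
        return f"Theta(n^{d} log n)"
    if d < crit:                          # Case 1: recursion dominates
        return f"Theta(n^{crit:.2f})"
    return f"Theta(n^{d})"                # Case 3: f(n) dominates

print(master_case(2, 2, 1))  # Merge Sort, T(n) = 2T(n/2) + n: Theta(n^1 log n)
print(master_case(4, 2, 1))  # T(n) = 4T(n/2) + n: Theta(n^2.00)
print(master_case(1, 2, 2))  # T(n) = T(n/2) + n^2: Theta(n^2)
```

The `math.isclose` check avoids float-comparison surprises when log_b(a) is not computed exactly.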
🔄 Recurrence Quick Reference
T(n) = T(n/2) + c → O(log n)  (Binary Search)
T(n) = 2T(n/2) + n → O(n log n)  (Merge Sort)
T(n) = T(n-1) + c → O(n)  (Linear)
T(n) = T(n-1) + n → O(n²)  (Quadratic)
📦 Data Structures Cheat Sheet
Stack: LIFO | Push/Pop O(1)
Queue: FIFO | Enqueue/Dequeue O(1)
Array: Random access O(1) | Search O(n)
Linked List: Insert O(1) | Access O(n)
Hash Table: Search O(1) avg | Collision resolution
BST: Search/Insert/Delete O(log n) avg, O(n) worst
AVL: All operations O(log n) guaranteed
Heap: Insert/Delete O(log n) | Get max O(1)
🌳 Trees Quick Points
BST: Left < Parent < Right
AVL: Balance factor ∈ {-1, 0, 1}
Red-Black: Root black, no red-red parent-child
B-Tree: Multi-way, all leaves same level
Heap: Max: Parent ≥ Children, Min: Parent ≤ Children
πŸ•ΈοΈ Graph Essentials
Representation: β€’ Matrix: O(VΒ²) space, O(1) edge check β€’ List: O(V+E) space, better for sparse Traversal: β€’ BFS: Queue, level-order, shortest path β€’ DFS: Stack/Recursion, deep exploration Time: Both O(V + E)
🔒 Sorting Algorithms
Linear time (non-comparison):
• Counting: O(n+k) | Small range of integers
• Radix: O(d(n+k)) | Fixed-digit numbers
• Bucket: O(n+k) avg | Uniform distribution

Comparison-based (lower bound O(n log n)):
• Merge: O(n log n) always, stable
• Quick: O(n log n) avg, O(n²) worst
• Heap: O(n log n), in-place
πŸ“ Important Formulas
1 + 2 + 3 + ... + n = n(n+1)/2 = O(nΒ²) 1 + 2 + 4 + ... + 2ⁿ = 2ⁿ⁺¹ - 1 = O(2ⁿ) log(ab) = log a + log b log(aᡇ) = b log a
⚠️ Common Exam Mistakes
❌ Confusing O with Ω with Θ
❌ Forgetting to drop constants
❌ Wrong Master Method case selection
❌ Mixing up Stack (LIFO) with Queue (FIFO)
❌ BST worst case is O(n), not O(log n)
❌ Wrong heap array indices: with 0-based storage, left = 2i+1, right = 2i+2
❌ DFS uses stack, BFS uses queue
❌ Counting Sort needs small range
🎯 Must Remember for Exam
✓ Time complexity hierarchy
✓ Master Theorem 3 cases
✓ Big O, Omega, Theta definitions
✓ Stack vs Queue operations
✓ BST, AVL, Red-Black properties
✓ BFS vs DFS (when to use)
✓ Adjacency Matrix vs List
✓ Counting/Radix/Bucket sort differences
✓ Array formulas (child/parent in heap)
💡 Exam Strategy
For Theory Questions:
• Define clearly
• Give example
• Draw diagram if possible
• Mention time complexity

For Algorithm Questions:
• Write steps clearly
• Show example
• Analyze complexity
• Compare with alternatives

Time Management:
• Start with easy questions
• Attempt all questions
• Keep 10 mins for revision
🌟 All the Best!
You've covered everything! DAA is about understanding patterns and complexity. Practice different problems, understand the logic, and you'll ace it! Remember: Every algorithm has a purpose. Good luck! 💪🚀